Overview
Dataset statistics
| Number of variables | 6 |
|---|---|
| Number of observations | 14227140 |
| Missing cells | 9 |
| Missing cells (%) | < 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 5.3 GiB |
| Average record size in memory | 399.9 B |
Variable types
| Text | 6 |
|---|---|
Alerts
| nconst has unique values | Unique |
|---|---|
Reproduction
| Analysis started | 2025-03-04 03:58:42.429407 |
|---|---|
| Analysis finished | 2025-03-04 04:05:09.957605 |
| Duration | 6 minutes and 27.53 seconds |
| Software version | ydata-profiling v4.12.2 |
| Download configuration | config.json |
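The report does not record how the table was loaded. As a minimal sketch, assuming the data is a tab-separated file read with pandas and profiled with ydata-profiling's `ProfileReport`, a comparable report could be produced roughly as follows. The file path, output name, report title, and the choice to map `\N` to missing are assumptions for illustration, not settings taken from config.json.

```python
from pathlib import Path

# Literal placeholder that appears in the year and list columns of this report.
NULL_MARKER = r"\N"


def profile_names(tsv_path: str, out_html: str = "names_profile.html") -> Path:
    """Load a tab-separated file as text and write a profiling report (sketch)."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # dtype=str keeps every column as text; na_values maps the placeholder to
    # NaN so it is counted as missing instead of as an ordinary value.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str, na_values=[NULL_MARKER])

    # minimal=True turns off the more expensive computations (assumed choice).
    report = ProfileReport(df, title="Name table profile", minimal=True)
    report.to_file(out_html)
    return Path(out_html)
```

Mapping `\N` to missing would change the Missing counts relative to this report, where the placeholder is counted as ordinary text.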
Variables
nconst
Text
Unique 
| Distinct | 14227140 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 901.0 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.408556 |
| Min length | 9 |
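The length statistics and the sample values suggest that every identifier is `nm` followed by 7 or 8 digits. A minimal sketch of a format check under that assumption, plus a consistency check of the mean length against the total character count reported for this variable (133856844):

```python
import re

# Assumed pattern: "nm" followed by 7 or 8 ASCII digits (9 or 10 characters).
NCONST_RE = re.compile(r"nm[0-9]{7,8}")


def is_valid_nconst(value: str) -> bool:
    """True if the whole string matches the assumed identifier pattern."""
    return NCONST_RE.fullmatch(value) is not None


# "nm12345678" is a made-up 10-character identifier; "nm0000001" and
# "nm9993719" come from the report, the last two are deliberate mismatches.
candidates = ["nm0000001", "nm9993719", "nm12345678", "tt0072308", "nm123"]
valid = [c for c in candidates if is_valid_nconst(c)]
print(valid)  # ['nm0000001', 'nm9993719', 'nm12345678']

# Every length is 9 or 10, so each 10-character identifier contributes
# exactly one character beyond 9 per row.
rows = 14_227_140
total_chars = 133_856_844
ten_char_ids = total_chars - 9 * rows
print(ten_char_ids)              # 5812584
print(9 + ten_char_ids / rows)   # should agree with the reported mean length
```

Under this reading, about 41% of the identifiers have 10 characters, which is consistent with the reported mean length of 9.408556.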
Unique
| Unique | 14227140 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | nm0000001 |
|---|---|
| 2nd row | nm0000002 |
| 3rd row | nm0000003 |
| 4th row | nm0000004 |
| 5th row | nm0000005 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| nm0000012 | 1 | < 0.1% |
| nm9993719 | 1 | < 0.1% |
| nm0000001 | 1 | < 0.1% |
| nm0000002 | 1 | < 0.1% |
| nm9993686 | 1 | < 0.1% |
| nm9993687 | 1 | < 0.1% |
| nm9993688 | 1 | < 0.1% |
| nm9993689 | 1 | < 0.1% |
| nm9993690 | 1 | < 0.1% |
| nm9993691 | 1 | < 0.1% |
| Other values (14227130) | 14227130 |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 16110889 | |
| n | 14227140 | |
| m | 14227140 | |
| 0 | 10346563 | |
| 3 | 10265509 | |
| 2 | 10260491 | |
| 4 | 10187127 | |
| 5 | 10155167 | |
| 6 | 10087140 | |
| 7 | 9360859 | |
| Other values (2) | 18628819 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 133856844 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 133856844 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 133856844 |
primaryName
Text
| Distinct | 10907648 |
|---|---|
| Distinct (%) | 76.7% |
| Missing | 9 |
| Missing (%) | < 0.1% |
| Memory size | 990.9 MiB |
Length
| Max length | 105 |
|---|---|
| Median length | 78 |
| Mean length | 13.510672 |
| Min length | 1 |
Unique
| Unique | 9798529 |
|---|---|
| Unique (%) | 68.9% |
Sample
| 1st row | Fred Astaire |
|---|---|
| 2nd row | Lauren Bacall |
| 3rd row | Brigitte Bardot |
| 4th row | John Belushi |
| 5th row | Ingmar Bergman |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| david | 134327 | 0.5% |
| john | 126293 | 0.4% |
| michael | 125434 | 0.4% |
| james | 87749 | 0.3% |
| de | 81558 | 0.3% |
| paul | 70434 | 0.2% |
| robert | 69162 | 0.2% |
| daniel | 68951 | 0.2% |
| chris | 68371 | 0.2% |
| thomas | 62607 | 0.2% |
| Other values (2255244) | 28724169 |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 19993595 | 10.4% |
| e | 16218300 | 8.4% |
| (space) | 15391924 | 8.0% |
| n | 13140925 | 6.8% |
| i | 13045417 | 6.8% |
| r | 12022862 | 6.3% |
| o | 10491560 | 5.5% |
| l | 8834260 | 4.6% |
| s | 6982301 | 3.6% |
| t | 6211561 | 3.2% |
| Other values (198) | 69885395 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 192218100 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 192218100 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 192218100 |
birthYear
Text
| Distinct | 559 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 801.7 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 2 |
| Mean length | 2.0900079 |
| Min length | 1 |
Unique
| Unique | 172 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 1899 |
|---|---|
| 2nd row | 1924 |
| 3rd row | 1934 |
| 4th row | 1949 |
| 5th row | 1918 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| n | 13586838 | |
| 1980 | 10261 | 0.1% |
| 1981 | 9967 | 0.1% |
| 1979 | 9878 | 0.1% |
| 1982 | 9841 | 0.1% |
| 1978 | 9740 | 0.1% |
| 1983 | 9473 | 0.1% |
| 1984 | 9450 | 0.1% |
| 1977 | 9170 | 0.1% |
| 1985 | 9111 | 0.1% |
| Other values (549) | 553411 | 3.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| \ | 13586838 | |
| N | 13586838 | |
| 1 | 719330 | 2.4% |
| 9 | 718202 | 2.4% |
| 8 | 209052 | 0.7% |
| 7 | 159796 | 0.5% |
| 6 | 137945 | 0.5% |
| 2 | 128926 | 0.4% |
| 4 | 125781 | 0.4% |
| 5 | 124573 | 0.4% |
| Other values (2) | 237554 | 0.8% |
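The `n` entry in the word table appears to be the `\N` placeholder after tokenization, since `\` and `N` each occur 13586838 times in the character table. Because the column is profiled as text, these cells count as values rather than as missing. A minimal sketch of converting the column to optional integers, assuming `\N` is the only non-numeric value (the helper name `parse_year` is hypothetical):

```python
from typing import Optional

# Literal placeholder used where no year is recorded.
NULL_MARKER = r"\N"


def parse_year(raw: str) -> Optional[int]:
    """Return the year as an int, or None for the null placeholder."""
    if raw == NULL_MARKER:
        return None
    return int(raw)


# Values taken from the sample rows of this report.
sample = ["1899", "1924", r"\N", "1949"]
years = [parse_year(v) for v in sample]
print(years)  # [1899, 1924, None, 1949]

# Share of birthYear cells holding the placeholder, from the counts in this report.
rows = 14_227_140
null_rows = 13_586_838
null_share = null_rows / rows
print(f"{null_share:.1%}")  # 95.5%
```

The remaining 640302 rows match the sum of the year counts in the word table, so roughly 4.5% of the records carry a birth year.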
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 29734835 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 29734835 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 29734835 |
deathYear
Text
| Distinct | 502 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 801.0 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 2 |
| Mean length | 2.0338514 |
| Min length | 2 |
Unique
| Unique | 175 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 1987 |
|---|---|
| 2nd row | 2014 |
| 3rd row | \N |
| 4th row | 1982 |
| 5th row | 2007 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| n | 13986314 | |
| 2021 | 7607 | 0.1% |
| 2022 | 7246 | 0.1% |
| 2020 | 7223 | 0.1% |
| 2023 | 7004 | < 0.1% |
| 2024 | 6284 | < 0.1% |
| 2019 | 6100 | < 0.1% |
| 2018 | 5866 | < 0.1% |
| 2016 | 5760 | < 0.1% |
| 2017 | 5741 | < 0.1% |
| Other values (492) | 181995 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| \ | 13986314 | |
| N | 13986314 | |
| 2 | 195513 | 0.7% |
| 0 | 195308 | 0.7% |
| 1 | 192239 | 0.7% |
| 9 | 161615 | 0.6% |
| 8 | 46958 | 0.2% |
| 7 | 40013 | 0.1% |
| 6 | 35292 | 0.1% |
| 4 | 33918 | 0.1% |
| Other values (2) | 62405 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 28935889 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 28935889 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 28935889 |
primaryProfession
Text
| Distinct | 23206 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 938.9 MiB |
Length
| Max length | 67 |
|---|---|
| Median length | 64 |
| Mean length | 12.197141 |
| Min length | 2 |
Unique
| Unique | 5599 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | actor,miscellaneous,producer |
|---|---|
| 2nd row | actress,soundtrack,archive_footage |
| 3rd row | actress,music_department,producer |
| 4th row | actor,writer,music_department |
| 5th row | writer,director,actor |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| n | 2783996 | |
| actor | 2516928 | |
| actress | 1615254 | 11.4% |
| miscellaneous | 822004 | 5.8% |
| producer | 487849 | 3.4% |
| camera_department | 439594 | 3.1% |
| art_department | 265486 | 1.9% |
| writer | 230821 | 1.6% |
| sound_department | 222345 | 1.6% |
| composer | 174667 | 1.2% |
| Other values (23196) | 4668196 |
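The sample rows show that each cell holds a comma-separated list of professions, which is why the word counts above refer to individual tokens rather than whole cells. A minimal sketch of splitting such cells and counting tokens, assuming `\N` marks an empty list (the helper name `split_multi` is hypothetical):

```python
from collections import Counter

# Literal placeholder used where no value is recorded.
NULL_MARKER = r"\N"


def split_multi(raw: str) -> list[str]:
    """Split a comma-separated cell; the null placeholder yields an empty list."""
    if raw == NULL_MARKER:
        return []
    return raw.split(",")


# The five sample cells from this report plus one placeholder cell.
cells = [
    "actor,miscellaneous,producer",
    "actress,soundtrack,archive_footage",
    "actress,music_department,producer",
    "actor,writer,music_department",
    "writer,director,actor",
    r"\N",
]
counts = Counter(token for cell in cells for token in split_multi(cell))
print(counts["actor"], counts["producer"], counts["director"])  # 3 2 1
```

The same helper would apply to knownForTitles, which uses the same comma-separated layout for title identifiers.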
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| r | 19906824 | |
| e | 19883937 | |
| t | 18424334 | |
| a | 16519238 | |
| c | 12799795 | 7.4% |
| o | 11508296 | 6.6% |
| s | 10609925 | 6.1% |
| n | 7763103 | 4.5% |
| m | 7670612 | 4.4% |
| i | 7291113 | 4.2% |
| Other values (16) | 41153260 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 173530437 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 173530437 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 173530437 |
knownForTitles
Text
| Distinct | 5914773 |
|---|---|
| Distinct (%) | 41.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 992.8 MiB |
Length
| Max length | 43 |
|---|---|
| Median length | 42 |
| Mean length | 16.170076 |
| Min length | 2 |
Unique
| Unique | 4889031 |
|---|---|
| Unique (%) | 34.4% |
Sample
| 1st row | tt0072308,tt0050419,tt0027125,tt0031983 |
|---|---|
| 2nd row | tt0037382,tt0075213,tt0117057,tt0038355 |
| 3rd row | tt0057345,tt0049189,tt0056404,tt0054452 |
| 4th row | tt0072562,tt0077975,tt0080455,tt0078723 |
| 5th row | tt0050986,tt0069467,tt0050976,tt0083922 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1620890 | 11.4% |
| tt0123338 | 8258 | 0.1% |
| tt22014400 | 7508 | 0.1% |
| tt6168110 | 6382 | < 0.1% |
| tt0441074 | 4882 | < 0.1% |
| tt0072584 | 4305 | < 0.1% |
| tt0159881 | 4068 | < 0.1% |
| tt11874658 | 3905 | < 0.1% |
| tt0479832 | 3898 | < 0.1% |
| tt4202558 | 3624 | < 0.1% |
| Other values (5914763) | 12559420 |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| t | 46537780 | |
| 0 | 22939005 | |
| 1 | 21137684 | |
| 2 | 19761831 | |
| 4 | 17136694 | 7.4% |
| 3 | 16568506 | 7.2% |
| 8 | 15932856 | 6.9% |
| 6 | 15672449 | 6.8% |
| 5 | 13793179 | 6.0% |
| 7 | 13492850 | 5.9% |
| Other values (4) | 27081101 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 230053935 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 230053935 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 230053935 |
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
Sample
First rows
| | nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles |
|---|---|---|---|---|---|---|
| 0 | nm0000001 | Fred Astaire | 1899 | 1987 | actor,miscellaneous,producer | tt0072308,tt0050419,tt0027125,tt0031983 |
| 1 | nm0000002 | Lauren Bacall | 1924 | 2014 | actress,soundtrack,archive_footage | tt0037382,tt0075213,tt0117057,tt0038355 |
| 2 | nm0000003 | Brigitte Bardot | 1934 | \N | actress,music_department,producer | tt0057345,tt0049189,tt0056404,tt0054452 |
| 3 | nm0000004 | John Belushi | 1949 | 1982 | actor,writer,music_department | tt0072562,tt0077975,tt0080455,tt0078723 |
| 4 | nm0000005 | Ingmar Bergman | 1918 | 2007 | writer,director,actor | tt0050986,tt0069467,tt0050976,tt0083922 |
| 5 | nm0000006 | Ingrid Bergman | 1915 | 1982 | actress,producer,soundtrack | tt0034583,tt0038109,tt0036855,tt0038787 |
| 6 | nm0000007 | Humphrey Bogart | 1899 | 1957 | actor,producer,miscellaneous | tt0034583,tt0043265,tt0033870,tt0037382 |
| 7 | nm0000008 | Marlon Brando | 1924 | 2004 | actor,director,writer | tt0078788,tt0068646,tt0047296,tt0070849 |
| 8 | nm0000009 | Richard Burton | 1925 | 1984 | actor,producer,director | tt0061184,tt0087803,tt0059749,tt0057877 |
| 9 | nm0000010 | James Cagney | 1899 | 1986 | actor,director,producer | tt0029870,tt0031867,tt0042041,tt0034236 |
Last rows
| | nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles |
|---|---|---|---|---|---|---|
| 14227130 | nm9993709 | Lu Bevins | \N | \N | producer,director,writer | tt17717854,tt11772904,tt11772812,tt11697102 |
| 14227131 | nm9993710 | Nestor Rudnytskyy | \N | \N | \N | \N |
| 14227132 | nm9993711 | David Gluzman | \N | \N | \N | \N |
| 14227133 | nm9993712 | Corny O'Connell | \N | \N | \N | \N |
| 14227134 | nm9993713 | Sambit Mishra | \N | \N | writer,producer | tt20319332,tt27191658,tt10709066,tt15134202 |
| 14227135 | nm9993714 | Romeo del Rosario | \N | \N | animation_department,art_department | tt11657662,tt14069590,tt2455546 |
| 14227136 | nm9993716 | Essias Loberg | \N | \N | \N | \N |
| 14227137 | nm9993717 | Harikrishnan Rajan | \N | \N | cinematographer | tt8736744 |
| 14227138 | nm9993718 | Aayush Nair | \N | \N | cinematographer | tt8736744 |
| 14227139 | nm9993719 | Andre Hill | \N | \N | \N | \N |
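The report counts only 9 missing cells, yet the sample rows contain many `\N` entries. This appears to be because `\N` is stored as ordinary text and so is not treated as missing. A minimal sketch of counting the placeholder per column, using five of the last rows shown above and assuming `\N` is the only null marker (the helper name `null_counts` is hypothetical):

```python
# Literal placeholder used where no value is recorded.
NULL_MARKER = r"\N"

COLUMNS = ["nconst", "primaryName", "birthYear", "deathYear",
           "primaryProfession", "knownForTitles"]

# Rows 14227130 to 14227134, copied from the sample of last rows.
TAIL_ROWS = [
    ["nm9993709", "Lu Bevins", NULL_MARKER, NULL_MARKER,
     "producer,director,writer", "tt17717854,tt11772904,tt11772812,tt11697102"],
    ["nm9993710", "Nestor Rudnytskyy", NULL_MARKER, NULL_MARKER,
     NULL_MARKER, NULL_MARKER],
    ["nm9993711", "David Gluzman", NULL_MARKER, NULL_MARKER,
     NULL_MARKER, NULL_MARKER],
    ["nm9993712", "Corny O'Connell", NULL_MARKER, NULL_MARKER,
     NULL_MARKER, NULL_MARKER],
    ["nm9993713", "Sambit Mishra", NULL_MARKER, NULL_MARKER,
     "writer,producer", "tt20319332,tt27191658,tt10709066,tt15134202"],
]


def null_counts(rows: list[list[str]]) -> dict[str, int]:
    """Count cells equal to the null placeholder in each column."""
    counts = {name: 0 for name in COLUMNS}
    for row in rows:
        for name, cell in zip(COLUMNS, row):
            if cell == NULL_MARKER:
                counts[name] += 1
    return counts


counts = null_counts(TAIL_ROWS)
for name in COLUMNS:
    print(f"{name}: {counts[name]}")
```

In these five rows the placeholder fills every birthYear and deathYear cell and three of the five primaryProfession and knownForTitles cells, none of which the Missing statistics reflect.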